Sifting abstracts from Medline and evaluating their relevance to molecular biology

Authors

  • Natalia Grabar
  • Marie-Christine Jaulent
  • Antoine Chambaz
  • Céline Lefebvre
  • Christian Néri
Abstract

The most important knowledge in biology currently resides in raw text documents. Bibliographic databases of biomedical articles can be searched, but an efficient procedure should also evaluate the relevance of the retrieved documents to biology. In genetics, this challenge is even harder because gene naming conventions are inconsistent. We aim to define a sound approach for collecting abstracts that are relevant to biology and to the species and genes under study. Our approach relies on defining the best queries and on detecting and filtering the best sources.
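As a rough illustration of the two ideas named in the abstract (query definition and source filtering), the following is a minimal sketch and not the authors' actual procedure. The `Abstract` record, the function names, the Boolean query syntax, and the example gene and journal names are all hypothetical assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Abstract:
    """Hypothetical record for one retrieved abstract."""
    pmid: str
    journal: str
    text: str


def build_query(gene_synonyms, species):
    """Join gene name variants with OR, then require the species with AND.

    Listing synonyms explicitly is one simple way to cope with
    inconsistent gene naming; the query syntax here is an assumption.
    """
    genes = " OR ".join(f'"{name}"' for name in gene_synonyms)
    return f'({genes}) AND "{species}"'


def filter_abstracts(abstracts, gene_synonyms, trusted_journals):
    """Keep abstracts from trusted sources that mention any gene synonym."""
    names = [name.lower() for name in gene_synonyms]
    kept = []
    for item in abstracts:
        # Source filter: skip journals outside the trusted set.
        if item.journal not in trusted_journals:
            continue
        # Relevance filter: case-insensitive substring match on synonyms.
        text = item.text.lower()
        if any(name in text for name in names):
            kept.append(item)
    return kept


# Usage with made-up data (gene and journal names are placeholders).
synonyms = ["geneA", "GENE-A"]
query = build_query(synonyms, "Homo sapiens")
corpus = [
    Abstract("1", "Journal X", "Expression of GeneA in neurons."),
    Abstract("2", "Journal Y", "Expression of geneA in muscle."),
    Abstract("3", "Journal X", "A survey of hospital workflows."),
]
relevant = filter_abstracts(corpus, synonyms, {"Journal X"})
```

Substring matching is deliberately naive; a real system would likely need tokenization and a curated synonym list to avoid false matches on short gene symbols.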

Similar resources

Ontology Based Corpus Annotation and Tools

With the explosion of results in molecular biology there is an increased need for IE to extract knowledge to support database building and to search intelligently for information in online journal collections. We aim to build information extraction systems from biology papers and their abstracts available from the MEDLINE database[1, 3]. As a part of a project on information extraction from the...

Towards Retrieving Relevant Information for Answering Clinical Comparison Questions

This paper introduces the task of automatically answering clinical comparison questions using MEDLINE abstracts. In the beginning, clinical comparison questions and the main challenges in recognising and extracting their components are described. Then, different strategies for retrieving MEDLINE abstracts are shown. Finally, the results of an initial experiment judging the relevance of MEDLINE ...

Part-of-Speech Tagging in Molecular Biology Scientific Abstracts Using Morphological and Contextual Statistical Information

In this paper a probabilistic tagger for molecular biology related abstracts is presented and evaluated. The system consists of three modules: a rule based molecular-biology names detector, an unknown words handler, and a Hidden Markov model based tagger which are used to annotate the corpus with an extended set of grammatical and molecular biology tags. The complete system has been evaluated u...

Mining molecular binding terminology from biomedical text

Automatic access to information regarding macromolecular binding relationships would provide a valuable resource to the biomedical community. We report on a pilot project to mine such information from the molecular biology literature. The program being developed takes advantage of natural language processing techniques and is supported by two repositories of biomolecular knowledge. A formative ...

Extracting the Names of Genes and Gene Products with a Hidden Markov Model

We report the results of a study into the use of a linear interpolating hidden Markov model (HMM) for the task of extracting technical terminology from MEDLINE abstracts and texts in the molecular-biology domain. This is the first stage in a system that will extract event information for automatically updating biology databases. We trained the HMM entirely with bigrams based on lexical and charac...

Journal title:
  • Studies in health technology and informatics

Volume 124  Issue 

Pages  -

Publication date 2006